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This  paper  will  demonstrate  the  use  of  SAS®  and  SAS/OR®  to  solve  a  long  standing  Army  problem  of  assigning 
ROTC  cadets  to  their  initial  basic  branch  (Infantry,  Armor,  etc).  The  paper  starts  with  a  problem  statement,  describes 
the  problem  as  a  network  optimization  and  then  shows  model  results  after  adding  each  of  the  constraints  from  the 
problem  statement.  The  paper  summarizes  with  comments  about  why  a  network  optimization  is  a  good  solution  for 
this  type  of  problem.  Finally,  the  paper  makes  the  assertion  that  SAS®  data  manipulation  and  statistics  procedures 
are  an  additional  benefit  provided  by  the  SAS/OR®  solution  not  found  elsewhere. 
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INTRODUCTION 

This  paper  shows  how  the  Army  could  optimize  the  assignments  of  Reserve  Officer  Training  Corp  (ROTC)  cadets  to 
their  initial  basic  branch  in  the  Army  using  a  network  optimization.  After  an  overview  of  the  business  problem  is 
provided,  we'll  implement  a  solution  using  the  NETFLOW  procedure  and  repeat  that  network  solution  using  the 
OPTMODEL  procedure.  The  OPTMODEL  implementation  will  be  extended  with  additional  constraints. 

In  this  paper,  we'll  solve  this  problem  one  step  at  a  time.  Initially,  we'll  just  assign  the  cadets  to  their  branches  based 
on  their  preferences  and  Army  branch  'demand',  then  add  constraints  and  assess  the  impact  of  these  constraints. 

PROBLEM  DETAILS 

The  following  facts  apply  to  the  ROTC  assignment  example  used  in  this  paper. 

We  start  with  a  'supply'  of  2545  cadets. 

Each  cadet  has  5  basic  branch  preferences,  gender,  and  an  Order  of  Merit  Score  (OMS)  attribute. 

The  Army  basic  branches  have  a  total  of  2545  assignments  (demands). 

The  Army  cannot  put  females  in  combat  arms  branches  (IN,  AR,  and  FA). 

The  Army  needs  to  proportionally  distribute  cadets  based  on  gender,  and  OMS. 

The  objective  of  this  model  is  to  maximize  cadet  satisfaction  while  still  meeting  Army  need. 

The  initial  problem  of  assigning  2545  cadets  to  basic  branches  with  maximum  cadet  satisfaction  (a  cadet  is  scored  as 
satisfied  if  they  get  their  1^,  2™  or  3'^'^  preference)  will  use  the  OMS  values,  cadet  preferences,  and  the  branch 
demands.  Subsequently,  constraints  related  to  gender  and  OMS  values  will  be  included. 

AN  INITIAL  MODEL 

An  example  of  data  input  for  this  model  may  add  clarity  and  is  shown  is  in  Figure  1 .  This  data  shows  the  Supply  side 
input  to  this  model,  with  P1-P5  being  that  cadet’s  Branch  preferences.  The  data  in  figure  2  shows  the  Army  Branch 
demand  for  these  cadets.  We  have  2545  cadets  and  branch  demand  also  sums  to  2545. 
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Figure  1,  Supply:  cadet  data  (5  of  2545)  ordered  by  OMS 

PROG  NETFLOW  takes  a  data  set  of  nodes  and  arcs  as  basic  input.  Given  the  data  represented  in  Figures  2  and  3, 
we  should  be  able  to  assemble  a  dataset  of  nodes  (both  supply  and  demand)  and  the  arcs  between  them.  The  node 
data  should  look  like  that  represented  in  first  column  of  Figure  4.  The  supply  nodes  are  positive  and  the  demand 
nodes  are  negative,  conservation  of  flow  should  sum  all  nodes  to  zero. 
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Branch  Name 

Branch  Code 

demand 

Air  Defense 

AD 

82 

Adjutant  Generai 

AG 

144 

Armer 

AR 

128 

Aviatien 

AV 

146 

Chemicai 

CM 

67 

Engineering 

EN 

189 

Fieid  Artiiiery 

FA 

233 

Finance 

Fi 

35 

Infantry 

iN 

282 

Military  Inteiligence 

Ml 

362 

Miiitary  Poiice 

MR 

100 

Ordinance 

OD 

208 

Quartermaster 

QM 

164 

Signai  Cerp 

SC 

255 

Transpertation  Corp 

TC 

150 

Tetal 

2545 

Figure  2,  Demand:  Branch  needs  data 


After  setting  up  the  basic  probiem,  the  chaiienge  beccmes  determining  a  methed  for  assigning  vaiues  to  arcs  (that 
connect  ROTC  cadets  to  assignments).  Reguiar  SAS®  data  step  procedures  wiil  support  this  niceiy.  Starting  with  ail 
the  arcs  (cadet  to  basic  branch  path,  one  per  branch  for  every  cadet),  ali  arcs  are  scored  as  a  1  and  the  score  is 
subsequentiy  adjusted  based  on  branch  preferences  and  OML  sceres.  This  scoring  iogic  is  shown  beiow  in  Figure  3. 

•  Ali  arcs  initiaiiy  scored  as  1 

•  Score  then  adjusted  for  preferences, 

if  1®*  preference  then  that  arc  is  given,  +5 
if  2"'“  then  +4,  if  3"^  then  +3, 
if  4th  then  +2,  if  5*'^  then  +1 . 

•  Score  then  adjusted  for  OMS  iike  so, 

new  Score  =  current  Score/OMS  ranking 

Figure  3,  Scoring  Logic 


The  Arc  Data  for  cadet  3  is  represented  in  the  right  side  of  Figure  4.  Note  that  the  arc  dataset  is  assembied  in  a  SAS 
Data  step  using  logic  that  buiids  the  arcs  from  the  nede  data  (essentiaiiy  nede  data  is  aii  frem  the  input  fiies)  and  the 
scoring  iogic  mentioned  in  Figure  3.  A  graphic  of  the  arcs  fcr  cadet  3  are  shown  in  Figure  5.  Thicker  iines  represent 
the  higher  scored  arcs  (here  scores  are  cailed  _cost_,  the  ‘defauit’  name  in  this  precedure). 
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Node  Data  (mostly,  for  cadet  3),  supply 
is  positive  and  demand  are  negative 

Arc  data  for  cadet  3  (with  scores  called  _cost_) 
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Figure  4,  Node  and  Arc  Data  Sets  for  Cadet  3 


15  Basic 
Branches 


Figure  5,  Node  and  Arc  Data  for  Cadet  3 
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A  standard  form  representation  of  this  probiem  is  beiow.  The  objective  function  in  Figure  6  (beiow  the  word 
maximize),  represents  38175  arcs  (if  written  out,  this  formuia  wouid  have  38175  variabies  and  coefficients).  The 
constraint  equations  (beiow  the  words  ‘subject  to’)  represent  2545  possibie  cadets  for  each  branch. 


maximize 


1.00*l_ad  + 
1.00* l_mi  + 
4.00*l_fa  + 
0 . 50*2_en  + 
0 . 50*2_sc  + 
0.33*3_ad  + 
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6.00*l_in  + 
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. 0011*2545_en  + 
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+  . 0019*2545_fi 


subject  too 


1_ 

_ad 

+ 

2_ 

_ad 

+ 

3. 

_ad 

+ 

4_ 

_ad 

+ 

5_ 

-ad 

+ 

6. 

-ad 

+  ... 

+ 

2543. 

-ad 

+ 

2544. 

-ad 

+ 

2545_ad 

= 

82 

1_ 

-ag 

+ 

2_ 

-ag 

+ 

3. 

-ag 

+ 

4_ 

-ag 

+ 

5_ 

-ag 

+ 

6. 

-ag 

+  ... 

+ 

2543. 

-ag 

+ 

2544. 

-ag 

+ 

2545_ag 

= 

144 

1_ 

+ 

2_ 

+ 

3. 

+ 

4_ 

_ar 

+ 

5_ 

-ar 

+ 

6. 

-ar 

+  ... 

+ 

2543. 

-Sr 

+ 

2544. 

-Sr 

+ 

2545_ar 

= 

128 

1_ 

_av 

+ 

2_ 

_av 

+ 

3. 

_av 

+ 

4_ 

_av 

+ 

5_ 

-av 

+ 

6. 

-av 

+  ... 

+ 

2543. 

-av 

+ 

2544. 

-av 

+ 

2545_av 

= 

146 

1_ 

_cm 

+ 

2_ 

_cm 

+ 

3. 

_cm 

+ 

4_ 

_cm 

+ 

5_ 

_cm 

+ 

6_ 

_cm 

+  ... 

+ 

2543. 

_cm 

+ 

2544. 

_cm 

+ 

254  5_cm 

= 

67 

1_ 

_en 

+ 

2_ 

_en 

+ 

3. 

_en 

+ 

4_ 

_en 

+ 

5_ 

-Sn 

+ 

6_ 

-Cn 

+  ... 

+ 

2543. 

-Cn 

+ 

2544. 

-Cn 

+ 

2545_en 

= 

189 

1_ 

_f  a 

+ 

2_ 

_f  a 

+ 

3. 

_fa 

+ 

4_ 

-fa 

+ 

5_ 

-fa 

+ 

6. 

-fa 

+  ... 

+ 

2543. 

-fa 

+ 

2544. 

-fa 

+ 

2545_fa 

= 

233 

1_ 

+ 

2_ 

+ 

3. 

+ 

4_ 

+ 

5_ 

+ 

6. 

+  ... 

+ 

2543. 

_fi 

+ 

2544. 

_fi 

+ 

2545_fi 

= 

35 

1_ 

_in 

+ 

2_ 

_in 

+ 

3. 

_in 

+ 

4_ 

-in 

+ 

5_ 

-in 

+ 

6. 

-in 

+  ... 

+ 

2543. 

-in 

+ 

2544. 

-in 

+ 

2545_in 

= 

282 

1_ 

_mi 

+ 

2_ 

_mi 

+ 

3. 

_mi 

+ 

4_ 

-mi 

+ 

5_ 

_mi 

+ 

6. 

-mi 

+  ... 

+ 

2543. 

-mi 

+ 

2544. 

-mi 

+ 

254  5_mi 

= 

362 

1_ 

_mp 

+ 

2_ 

_mp 

+ 

3. 

_mp 

+ 

4_ 

-trip 

+ 

5_ 

-tup 

+ 

6. 

-tup 

+  ... 

+ 

2543. 

_mp 

+ 

2544. 

_mp 

+ 

254  5_mp 

= 

100 

1_ 

_od 

+ 

2_ 

_od 

+ 

3. 

_od 

+ 

4_ 

-Od 

+ 

5_ 

-Od 

+ 

6. 

-Od 

+  ... 

+ 

2543. 

-Od 

+ 

2544. 

-Od 

+ 

2545_od 

= 

208 

1_ 

_qm 

+ 

2_ 

_qm 

+ 

3. 

_qm 

+ 

4_ 

-qm 

+ 

5_ 

-qm 

+ 

6_ 

-qm 

+  ... 

+ 

2543. 

-qm 

+ 

2544. 

-qm 

+ 

254  5_qm 

= 

164 

1_ 

_sc 

+ 

2_ 

_sc 

+ 

3. 

_sc 

+ 

4_ 

-SC 

+ 

5_ 

-SC 

+ 

6. 

-SC 

+  ... 

+ 

2543. 

-SC 

+ 

2544. 

-SC 

+ 

2545_sc 

= 

255 

1_ 

_tc 

+ 

2_ 

_tc 

+ 

3. 

_tc 

+ 

4_ 

-tc 

+ 

5_ 

-tc 

+ 

6. 

-tc 

+  ... 

+ 

2543. 

-tc 

+ 

2544. 

-tc 

+ 

2545_tc 

= 

150 

subject  too 

l_ad  +  l_ag  +  l_ar  +  l_av  +  l_cm  +  l_en  +  l_fa  +  l_fi  +  l_in  +  l_mi 
2_ad  +  2_ag  +  2_ar  +  2_av  +  2_cm  +  2_en  +  2_fa  +  2_fi  +  2_in  +  2_mi 
3_ad  +  3_ag  +  3_ar  +  3_av  +  3_cm  +  3_en  +  3_fa  + 


l_mp  +  l_ms  +  l_od 
2_mp  +  2_ms  +  2_od 


l_qm  +  l_sc  +  l_tc  =  1 
2_qm  +  2_sc  +  2_tc  =  1 


2545_ad  +  2545_ag  +  2545_ar  +  2545_av  +  2545_cm  +  2545_en  +  2545_fa  +  2545_fi  +  2545_in  +  2545_mi  +  2545_mp  +  2545_ms  + 
2545_od  +  2545_qm  +  2545_sc  +  2545_tc  =  1 


Figure  6,  Standard  form 


The  cadet  characteristics  of  branch  preference  and  OML  scores  are  used  to  assign  coefficients  for  the  objective 
function,  and  the  vaiues  in  the  constraints  variabies  are  either  0  or  1 . 

Now  that  we  have  datasets  for  nodes  and  arcs,  we  are  aii  set  to  run  a  PROG  NETFLOW. 

proc  netflow  maximize 
nodedat  a=node_dat  a 
arcdata=arc_data 
arcout=  sol; 
reset  maxitl  =  30000; 
run; 

Figure  7,  PROC  NETFLOW 

We  aiso  specify  an  output  dataset  (arcout)  and  a  maximum  number  of  iterations  (reset  maxitl  =  30000)  as  shown  in 
Figure  7. 

The  log  file  lines  from  this  NETFLOW  procedure  are  shown  below  and  you  can  see  that  an  optimum  solution  was 
found. 


NOTE:  Number  of  iterations  performed  (neglecting  any  constraints ) =  10742  . 
NOTE:  Of  these,  7944  were  degenerate. 

NOTE:  Optimum  (neglecting  any  constraints)  found. 

NOTE:  Maximal  total  cost=  48.945545982  . 

NOTE:  The  data  set  WORK38175 . SOL  has  38175  observations  and  14  variables 

Figure  8,  Log  file  output  showing  PROC  NETFLOW  output. 
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AN  INITIAL  SOLUTION 


Results  for  cadet  3  (from  the  sol  dataset  specified  by  the  arcout  clause)  show  that  the  cadet  has  a  ‘KEY_ARC  BASIC’ 
which  assigns  that  cadet  to  the  AV  branch. 


from 

to 

cost 

caoac 

lo 

name 

SUPPLY 

DBVIAND 

FLOW 

ROOST 

Foosr 

ANUMB 

TNUMB 

STATUS 

3 

ad 

0.33333 

1 

0 

3_ad 

1 

82 

0 

0 

-1.66 

11836 

33 

LOWffBD  NONBASC 

3 

ag 

0.33333 

1 

0 

3_ag 

1 

144 

0 

0 

-1.66 

13594 

33 

LCWBTBD  NONBASC 

3 

ar 

1 

1 

0 

3_ar 

1 

128 

0 

0 

-1 

34688 

33 

LO\A/BTBD  NONBASC 

3 

av 

2 

1 

0 

3_av 

1 

146 

1 

2 

36445 

33 

KECATC  BASC 

3 

cm 

0.33333 

1 

0 

3_cm 

1 

67 

0 

0 

-1.66 

15352 

33 

LOWffBD  NONBASC 

3 

en 

0.33333 

1 

0 

3_en 

1 

189 

0 

0 

-1.66 

17110 

33 

LCWBTBD  NONBASC 

3 

fa 

0.33333 

1 

0 

3_fa 

1 

233 

0 

0 

-1.66 

18868 

33 

LO\A/BTBD  NONBASC 

3 

fi 

0.33333 

1 

0 

3_fi 

1 

35 

0 

0 

-1.66 

20625 

33 

LDWffBD  NONBASC 

3 

in 

0.33333 

1 

0 

3_in 

1 

281 

0 

0 

-1.67 

22383 

33 

LCWBTBD  NONBASC 

3 

mi 

0.66667 

1 

0 

3_mi 

1 

363 

0 

0 

-1.33 

32930 

33 

LOWBTBD  NONBASC 

3 

mp 

0.33333 

1 

0 

3  mp 

1 

100 

0 

0 

-1.66 

24140 

33 

LOWBTBD  NONBASC 

3 

od 

0.33333 

1 

0 

3_od 

1 

208 

0 

0 

-1.66 

25898 

33 

LCWBTBD  NONBASC 

3 

qm 

0.33333 

1 

0 

3_qm 

1 

164 

0 

0 

-1.66 

27656 

33 

LDWBTBD  NONBASC 

3 

sc 

0.33333 

1 

0 

3_sc 

1 

255 

0 

0 

-1.66 

29414 

33 

LCWBTBD  NONBASC 

3 

tc 

0.33333 

1 

0 

3_tc 

1 

150 

0 

0 

-1.66 

31172 

33 

LOWBTBD  NONBASC 

Figure  9,  arc  data  from  sol  dataset 
( _cost_  column  is  what  we  are  maximizing) 


Assignment  results,  shown  below  In  Figure  10,  are  for  the  first  and  last  cadets  (highest  and  lowest  QMS).  You  can 
see  that  the  _cost_  variable  Is  used  to  maximize  cadet  satisfaction.  The  more  cadets  get  their  1®*  choice,  the  larger 
the  aggregate  solution.  This  solution  has  a  maximum  cost  =  48.945545982,  from  the  log  file  output  of  Figure  8. 


RANK 

QMS 

cost 

assigned 

BR1 

BR2 

BR3 

BFW 

BF5 

choice 

1 

M 

97.2110 

6 

IN 

IN 

BSI 

FA 

MP 

9C 

1st 

2 

F 

96.8591 

3 

AV 

AV 

AG 

R 

MS 

Ml 

1st 

3 

M 

96.6930 

2 

AV 

AV 

MS 

AR 

Ml 

IN 

1st 

4 

M 

96.5456 

1.5 

AV 

AV 

IN 

BSI 

FA 

AR 

1st 

5 

M 

96.2215 

1.2 

IN 

IN 

Ml 

BSI 

MP 

AR 

1st 

6 

M 

95.9808 

1 

BSI 

BSI 

Ml 

IN 

AR 

MS 

1st 

7 

M 

95.5532 

0.85714286 

MP 

MP 

BSI 

Ml 

AR 

FA 

1st 

2536 

M 

65.3007 

0.00039432 

CD 

IN 

AR 

BSI 

Ml 

FA 

Other 

2537 

M 

64.9649 

0.002365 

TC 

TC 

AD 

FA 

9C 

Ml 

1st 

2538 

M 

64.5997 

0.00236407 

AD 

AD 

AR 

Ml 

FA 

IN 

1st 

2539 

M 

64.5783 

0.00236314 

QM 

QM 

SC 

AR 

TC 

Bsl 

1st 

2540 

M 

64.3637 

0.0023622 

SC 

9C 

Ml 

FA 

MP 

QD 

1st 

2541 

M 

64.1470 

0.00196773 

FA 

MP 

FA 

AD 

Bsl 

CM 

2nd 

2542 

M 

63.8208 

0.00236035 

QM 

QM 

SC 

QD 

FA 

AR 

1st 

2543 

M 

63.6510 

0.00235942 

TC 

TC 

FA 

QM 

AD 

MP 

1st 

2544 

M 

63.1117 

0.00235849 

CM 

CM 

AR 

IN 

Ml 

MP 

1st 

2545 

M 

59.7413 

0.00235756 

AD 

AD 

R 

AR 

Bsl 

Ml 

1st 

Figure  10,  Cadet  Assignments  compared  to  cadet  preference. 


The  Demand  objectives  are  filled  by  this  solution  and  Cadet  Satisfaction  is  maximized  as  shown  in  Figure  1 1  below. 


Demand  (Goal) 

Cadet  Satisfaction 
(1^’,  2"'‘,  or  3"*  preference  matched) 

BR 

ASSIGNED 

PERCENT 

GOAL 

AD 

82 

3.222 

82 

AG 

144 

5.6582 

144 

AR 

128 

5.0295 

128 

AV 

146 

5.7367 

146 

choice 

COUNT 

PERCENT 

CUM 

CM 

67 

2.6326 

67 

1st 

1572 

61.768 

61.768 

EN 

189 

7.4263 

189 

2nd 

640 

25.147 

86.916 

FA 

233 

9.1552 

233 

3rd 

253 

9.941 

96.857 

FI 

35 

1.3752 

35 

4th 

41 

1.611 

98.468 

IN 

281 

11.0413 

281 

5th 

14 

0.550 

99.018 

Ml 

363 

14.2633 

363 

other 

25 

0.982 

100.000 

MP 

100 

3.9293 

100 

OD 

208 

8.1729 

208 

QM 

164 

6.444 

164 

SC 

255 

10.0196 

255 

TC 

150 

5.8939 

150 

Figure  11,  Assignment  results  summary 

96.857%  cadet  satisfaction  is  pretty  good,  especially  when  we  are  exactly  matching  the  Army  branch  need  (Demand), 
but  we  have  ignored  the  additional  constraints  regarding  gender  and  combat  arms,  and  quality. 

The  constraint  that  prohibits  females  from  being  assigned  to  the  Combat  Arms  (IN,  FA,  and  AR)  branches  can  be 
Implemented  In  NETFLOW  Procedure  by  simply  deleting  the  arcs  between  female  cadets  and  those  three  branches 
from  the  arcs  dataset.  What  happens  If  you  wanted  to  add  constraints  to  proportionally  distribute  females  to  the 
remaining  branches?  You  can  see  from  Figure  12  below  that  the  distribution  of  cadets  by  1®*  preference  when 
compared  to  the  distribution  of  cadets  after  the  demand  constraint  is  applied.  The  1®*  preference  distribution  (on  the 
left)  makes  the  AG  branch  almost  50%  female.  A  proportional  distribution  by  gender  constraint  could  be  used  to 
make  all  permitted  branches  more  representative  of  the  whole  population. 


Counts  per  branch  all  preferences 


assigned 


Figure  12,  Cadet  Preference  compared  to  cadet  assignment. 
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The  NETFLOW  procedure  supports  applying  proportional  distribution  constraints  (via  ‘intermediate  nodes’)  and 
subsequently  applying  demand  side  constraints  {‘side  constraints’),  but  the  OPTMODEL  syntax  better  supports  the 
application  of  many  proportional  distribution  and  demand  side  constraints. 


AN  OPTMODEL  SOLUTION 

An  OPTMODEL  version  of  this  network  model  will  be  developed  before  we  add  any  more  constraints.  The 
OPTMODEL  procedure  adds  language  to  support  the  Operations  Research  (OR)  Math  Programming  Language  style 
of  modeling  using  index  sets  and  associated  logic  constructs.  The  data  sets  of  nodes  from  the  prior  model  will  be 
immediately  read  Into  Index  Sets,  we’ll  create  an  arcs  array  and  we’ll  re-run  the  model  above.  An  example  of  running 
input  data  into  a  data  set,  then  an  index  set  is  shown  below.  To  create  the  cadets  and  goals  data  sets,  read  In  the 
cadets.csv  file  and  the  goals.txt  file 

filename  cadet  /ins/cadets . csv' ; 
filename  goal  '. /ins/goal . txt ' ; 

/*  read  in  the  cadet  data  */ 
data  cadets_loaded; 

infile  cadet  DLM= ' , '  DSD  MISSOVER; 

input  QMS  $  SEX  $  RACE  $  prevbr  $  BRl  $  BR2  $  BR3  $  BR4  $  BR5  $  acd  $  ; 
if  _N_  >  1 ; 

omsnbr  =  input (OMS, 9 . ) ; 
ac  =  input ( acd, 9 . ) ; 

RCE=' 1 ' ; 

if  RACE  ne  then  RCE='2'; 

run ; 

proc  sort  data=cadets_loaded; 
by  descending  omsnbr; 

data  cadets; 

set  cadets_loaded; 
rank=_N_; 
run ; 

/*  this  reads  in  the  demand  data  goals.txt  */ 
data  goalst (drop=goal ) ; 
infile  goal  missover; 
input  BR  $1-2  goal  $  ; 
goalnbr  =  input (goal, 9 .) ; 
run ; 


Then  do  some  sql  and  data  stepping  to  expand  the  goals  dataset  to  support  additional  constraints  and  some 
calculations. 


proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

proc 

sql; 

select 

(  '  IN' 

'  ,  ' AR ’ 

'  ,  'FA' ) 

:ount(*)  as  cnt  into  :all_cnt  from  cadets;  quit; 

:ount(*)  into  : fem_cnt  from  cadets  where  SEX= ' F ' ;  quit; 
:ount(*)  into  :al_cnt  from  cadets  where  ac  =  1  ;  quit; 

:ount(*)  into  : a2_cnt  from  cadets  where  ac  =  2  ;  quit; 

:ount(*)  into  : a3_cnt  from  cadets  where  ac  =  3  ;  quit; 

:ount(*)  into  :a4_cnt  from  cadets  where  ac  =  4  ;  quit; 

:ount(*)  into  :min_cnt  from  cadets  where  RACE  ne  ' 
:ount(*)  into  :min2_cnt  from  cadets  where  RACE  eq 

:ount(*)  into  :min3_cnt  from  cadets  where  RACE  eq 

:ount(*)  into  :min4_cnt  from  cadets  where  RACE  eq 

:ount(*)  into  :min5_cnt  from  cadets  where  RACE  eq 

ivg (omsnbr)  into  : avg_oms  from  cadets;  quit; 

;um(goalnbr)  as  cnt  into  : ca_cnt  from  goalst  where 
quit  ; 


1'  ; 

quit  ; 

'  2  ' 

;  quit ; 

'3  ' 

;  quit ; 

'  4  ' 

;  quit ; 

'5' 

;  quit ; 

BR 

in 

data  goals; 
set  goalst; 

femgoals=round (goalnbr  *  &f em_cnt/ ( &all_cnt-&ca_cnt ) , 1 ) ; 
if  BR  in  ('IN'  'AR'  'FA')  then  femgoals  =  0; 
mingoals=round (goalnbr  *  &min_cnt/&all_cnt, 1 ) ; 
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min2goals=round (goalnbr  *  &min2_cnt/&all_cnt, 1 ) ; 
min3goals=round (goalnbr  *  &min3_cnt/&all_cnt, 1 ) ; 
min4goals=round (goalnbr  *  &min4_cnt/&all_cnt, 1 ) ; 
min5goals=round (goalnbr  *  &min5_cnt/&all_cnt, 1 ) ; 
algoals=round (goalnbr  *  &al_cnt/&all_cnt , 1 ) ; 
a2goals=round (goalnbr  *  &a2_cnt/&all_cnt , 1 ) ; 
a3goals=round (goalnbr  *  &a3_cnt/&all_cnt , 1 ) ; 
a4goals=round (goalnbr  *  &a4_cnt/&all_cnt, 1 ) ; 
qgoals=0 . 998*goalnbr*&avg_oms ; 
run ; 

Call  the  OPTMODEL  procedure  and  read  the  cadets  and  the  goals  data  sets  into  index  sets  (load  ‘nodes’  in  network 
terms),  oreate  an  ares  index  set,  and  declare  an  index  set  of  decision  variables  x.  You  can  check  the  index  set 
contents,  by  looking  into  your  output  file  for  the  results  of  the  print  statements. 

proc  optmodel  printlevel=2 ; 
set  C ; 

string  OMS { C } ; 
string  SEX { C } ; 
string  RACE { C } ; 
string  prevbr{C}; 
string  BRl { C } ; 
string  BR2 { C } ; 
string  BR3 { C } ; 
string  BR4 { C } ; 
string  BR5 { C } ; 
number  omsnbr{C}; 
number  ac { C } ; 
string  RCE { C } ; 
number  rank { C } ; 

read  data  cadets  into  C= [_N_] OMS  SEX  RACE  prevbr 

BRl  BR2  BR3  BR4  BR5  omsnbr  ac  RCE  rank; 

print  OMS  SEX  RACE  prevbr  BRl  BR2  BR3  BR4  BR5  omsnbr  ac  RCE  rank; 

set  B ; 

string  BR{ B } ; 
number  goalnbr {B}; 
number  femgoals{B}; 
number  mingoals{B}; 
number  min2goals { B } ; 
number  minSgoals { B } ; 
number  min4goals { B } ; 
number  minSgoals { B } ; 
number  algoalsfB}; 
number  a2goals{B}; 
number  a3goals{B}; 
number  a4goals{B}; 
number  qgoals{B}; 

read  data  goals  into  B= [_N_]  BR  goalnbr  femgoals  mingoals 
min4goals  minSgoals  min2goals  minSgoals  algoals 
a2goals  aSgoals  a4goals  qgoals; 

print  BR  goalnbr  femgoals  mingoals  min4goals  minSgoals  minSgoals 
minSgoals  algoals  aSgoals  aSgoals  a4goals  qgoals; 

Create  the  arcs  index  set  and  declare  an  x  decision  variable. 

number  arc{B,C}; 

for{i  in  B,  j  in  C} 

if  BR[i]=BRl[j]  then  arc[i,j]  =  S/rank[j]; 
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else  if  BR[i]=BR2[j]  then  arc[i,j]  =  4/rank[j]; 

else  if  BR[i]=BR3[j]  then  arc[i,j]  =  3/rank[j]; 

else  if  BR[i]=BR4[j]  then  arc[i,j]  =  2/rank[j]; 

else  if  BR[i]=BR5[j]  then  arc[i,j]  =  l/rank[j]; 

else  arc [ i , j ] =0 ; 

/*  print  arc;  */ 

Create  subsets  of  the  cadets  index  set,  like  all  females,  or  academic  degree  type  1 ,  or  all  redcat  categories. 
Generally  any  attribute  of  the  cadet  population  that  you  may  want  to  proportionally  distribute,  can  be  taken  into  a 
subset  here  with  the  setof  function  supported  by  OPTMODEL. 

set  CA  =  setof {i  in  B:  BR[i]  eq  'IN'  OR  BR [ i ]  eq  'AR'  OR  BR[i]  eq  'FA'}<i>; 
set  NCA  =  setof {i  in  B:  BR[i]  ne  'IN'  AND  BR[i]  ne  'AR'  AND  BR[i]  ne  'FA'}<i>; 
printfi  in  CA}  BR[i]; 

printfi  in  NCA}  BR [ i ] ; 

var  x{B,C}  >=  0; 

Using  the  decision  variables  in  the  x  index  set  and  the  scores  in  the  arcs  index  set  create  a  maximization  equation. 

maximize  total_score  =  sumfi  in  B,  j  in  C } arc [ i, j ] *x [ i, j ] ; 
constraint  supply} j  in  C}:  sumfi  in  B}x[i,j]=l; 

constraint  demandfi  in  B} :  sumfj  in  C } x [ i, j ] >=goalnbr [ i ] ; 

constraint  fems_no_ca{i  in  CA} :  sum{j  in  F}x[i,j]=0; 

/*constraint  fcadem{i  in  B}:  sumfj  in  F } x [ i, j ] >=f emgoals [ i ] ; 

constraint  qualityfi  in  B}:  sum{j  in  C } omsnbr [ j ] *x [ i, j ] >=qgoals [ i ] ; 
constraint  mindem{i  in  B}:  sumfj  in  M} x [ i, j ] >=mingoals [ i ] ; 
constraint  min2dem{i  in  B}:  sum{j  in  R2 } x [ i, j ] >=min2goals [ i ] ; 

constraint  min3dem{i  in  B}:  sum{j  in  R3 } x [ i, j ] >=min3goals [ i ] ; 

constraint  min4dem{i  in  B}:  sum{j  in  R4 } x [ i, j ] >=min4goals [ i ] ; 

constraint  minSdemfi  in  B}:  sum{j  in  R5 } x [ i, j ] >=min5goals [ i ] ; 

constraint  altypefi  in  B} :  sumfj  in  A1 } x [ i , j ] >=algoals  [  i ] ; 

constraint  a2type{i  in  B} :  sumfj  in  A2 } x [ i , j ] >=a2goals [ i ] ; 

constraint  a3type{i  in  B} :  sumfj  in  A3 } x [ i , j ] >=a3goals  [  i ] ; 

constraint  a4type{i  in  B} :  sumfj  in  A4 } x [ i , j ] >=a4goals  [  i ] ; */ 

solve ; 

Take  the  x  set  of  decision  variables  that  are  greater  then  zero  and  place  the  assignment,  rank  (here  cadet  number), 
arc  scores,  etc...  and  place  those  values  in  a  data  set  called  solt. 

/*  print  x;  */ 

create  data  solt  from  [B  C]={i  in  B,  j  in  C:  x[i,j]>0}  BR[i]  rank[j]  arc[i,j] 
ac[j]  SEX [ j ]  RACE [ j ]  RCE[j]  prevbr [ j ]  BRl [ j ]  BR2 [ j ]  BR3 [ j ] 

BR4 [ j ]  BR5[j]  omsnbr [j]; 

quit  ; 

The  data  set  solt  (temporary  solution)  is  now  available  to  support  analysis  and  expansion.  The  code  below 
categorizes  the  result  and  assigns  a  random  number  to  each  assignment  to  support  disambiguation  of  results. 

data  sol; 

set  solt (rename= {BR=assigned  arc=score) ) ; 
choice  =  '6other'; 

if  assigned  =  brl  then  choice  =  '1st'; 


else 

if 

assigned  =  br2 

then 

choice  = 

'2nd'  ; 

else 

if 

assigned  =  br3 

then 

choice  = 

'3rd'  ; 

else 

if 

assigned  =  br4 

then 

choice  = 

'  4th'  ; 

else 

if 

assigned  =  brS 

then 

choice  = 

'5th'  ; 

_random  =  RANUNIfO); 
rnd  =  put (_random,  1 0 . 8 )  ; 
rnk  =  put (rank,  6.); 
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obs  =  put (_N_, 6 . ) ; 
scorec=put (score, 10.8) ; 
run  ; 


Some  output  from  this  data  set  is  shown  below  in  Figure  13  and  you  can  see  that  the  top  ranked  cadets  are  generally 
getting  their  1®*  choice  assigned  to  them,  demand  is  met,  and  satisfaction  is  96.817%. 


Assignments  Listing,  the  first  10  rows 

B 

C 

assigned 

rank 

score 

ac 

SEX 

OMS 

BRl 

BR2 

BR3 

BR4 

BR5 

choice 

Rnd 

9 

1 

IN 

1 

5 

1 

M 

97.211 

IN 

EN 

FA 

MP 

SC 

1st 

0.746979 

4 

2 

AV 

2 

2.5 

2 

F 

96.8591 

AV 

AG 

FI 

MS 

Ml 

1st 

0.749089 

4 

3 

AV 

3 

1.66667 

2 

M 

96.693 

AV 

MS 

AR 

Ml 

IN 

1st 

0.776854 

4 

4 

AV 

4 

1.25 

4 

M 

96.5456 

AV 

IN 

EN 

FA 

AR 

1st 

0.691201 

9 

5 

IN 

5 

1 

1 

M 

96.2215 

IN 

Ml 

EN 

MP 

AR 

1st 

0.729527 

6 

6 

EN 

6 

0.83333 

2 

M 

95.9808 

EN 

Ml 

IN 

AR 

MS 

1st 

0.748766 

12 

7 

MP 

7 

0.71429 

1 

M 

95.5532 

MP 

EN 

Ml 

AR 

FA 

1st 

0.87659 

9 

8 

IN 

8 

0.625 

4 

M 

95.2922 

IN 

AR 

EN 

Ml 

OD 

1st 

0.979897 

6 

9 

EN 

9 

0.55556 

1 

M 

94.8416 

EN 

FI 

Ml 

SC 

AR 

1st 

0.861697 

10 

10 

Ml 

10 

0.5 

1 

F 

94.7545 

Ml 

MS 

EN 

FI 

AD 

1st 

0.724443 

Demand  (Goal) 

Cadet  Satisfaction  (l“  2"’', 

or  3'“'  preference  matched) 

BR 

ASSIGNED 

PERCENT 

GOAL 

AD 

82 

3.222 

82 

AG 

144 

5.6582 

144 

AR 

128 

5.0295 

128 

AV 

146 

5.7367 

146 

CHOICE 

COUNT 

SATISFIED 

CUMULATIVE 

CM 

67 

2.6326 

67 

1st 

1569 

61.6503 

61.65 

EN 

189 

7.4263 

189 

2nd 

644 

25.3045 

86.955 

FA 

233 

9.1552 

233 

3rd 

251 

9.8625 

96.817 

FI 

35 

1.3752 

35 

4th 

41 

1.611 

98.428 

IN 

281 

11.0413 

281 

5th 

14 

0.5501 

98.978 

Ml 

363 

14.2633 

363 

6other 

26 

1.0216 

100 

MP 

100 

3.9293 

100 

OD 

208 

8.1729 

208 

QM 

164 

6.444 

164 

SC 

255 

10.0196 

255 

TC 

150 

5.8939 

150 

Figure  13,  Assignments  summary  from  PROC  OPTMODEL 


Now  let’s  add  in  some  more  constraints.  An  example  of  a  proportional  distribution  constraint  is  shown  in  the  code 
above  (page  9)  and  below.  To  enable  the  female  proportional  distribution  constraint  in  the  problem  above,  we 
uncomment  this  constraint  and  re-run  the  model. 

constraint  fcademfi  in  B}:  sum{j  in  F } x [ i , j ] >=f emgoals [ i ] ; 

Note  that  the  FGOAL  numbers  are  calculated  by  the  counting  sql  statements  in  the  first  section  of  the  code  above 
where  we  loaded  the  text  files  into  datasets  and  extended  those  datasets.  The  specific  lines  that  calculated  femgoals 
are  shown  below. 

proc  sql;  select  count (*)  as  cnt  into  :all_cnt  from  cadets;  quit; 
proc  sql;  select  count (*)  into  : fem_cnt  from  cadets  where  SEX='F';  quit; 
proc  sql;  select  sum(goalnbr)  as  cnt  into  : ca_cnt  from  goalst  where  BR  in 
( ' IN' , 'AR' , 'FA' ) ;  quit; 

data  goals; 
set  goalst; 

femgoals=round (goalnbr  *  &f em_cnt/ ( &all_cnt-&ca_cnt ) , 1 ) ; 
if  BR  in  ('IN'  'AR'  'FA')  then  femgoals  =  0;  ... 
run ; 
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Some  output  from  this  run  is  shown  below  in  Figure  14  and  you  can  see  effects  of  adding  the  gender  proportional 
distribution  constraint,  demand  is  met,  satisfaction  is  95.678%. 


Female  distribution 


before  constraint 


After  constraint 


BBR 

ASGN 

FFMS 

FGOAL 

DELTA 

BBR 

ASGN 

FFMS 

FGOAL 

DELTA 

AD 

82 

15 

15 

0 

AD 

82 

15 

15 

0 

AG 

144 

69 

26 

43 

AG 

144 

26 

26 

0 

AR 

128 

0 

0 

0 

AR 

128 

0 

0 

0 

AV 

146 

15 

26 

-11 

AV 

146 

26 

26 

0 

CM 

67 

10 

12 

-2 

CM 

67 

12 

12 

0 

CHOICE 

CUMULATIVE 

EN 

189 

23 

34 

-11 

EN 

189 

34 

34 

0 

1st 

60.629 

FA 

233 

0 

0 

0 

FA 

233 

0 

0 

0 

2nd 

85.383 

FI 

35 

4 

6 

-2 

FI 

35 

6 

6 

0 

3rd 

95.678 

IN 

282 

0 

0 

0 

IN 

282 

0 

0 

0 

4th 

97.957 

Ml 

362 

62 
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Figure  14,  Distribution  of  females  and  effects  on  satisfaction. 

An  example  of  a  demand  side  constraint  (called  a  ‘side  constraint’  in  NETFLOW)  is  demonstrated  here  using  the 
OMS  scores  of  each  candidate  to  calculate  the  average  OMS  score  of  each  branch  (demand  side)  after  the 
assignments  are  made.  The  objective  of  such  a  constraint  will  be  to  make  the  average  OMS  scores  of  each  branch 
assignment  more  nearly  equal  so  that  one  branch  does  not  get  a  disproportionate  share  low  or  high  scoring  cadets. 

The  code  for  the  proc  sql  ‘counting’  and  calculations  to  do  support  this  demand  side  constraint  Is  shown  below.  Note 
that  the  problem  will  fall  If  you  demand  that  the  average  OMS  score  per  branch  Is  exactly  equal  to  the  overall  cadet 
sample  average,  so  we  need  to  ‘factor  back  from  1  ’  to  get  a  slightly  lower  quality  goal  per  branch. 

proc  sql;  select  avgfomsnbr)  into  : avg_oms  from  cadets;  quit; 

data  goals; 
set  goalst; 

qgoals=0 . 998*goalnbr*&avg_oms ; 
run ; 

The  ‘side  constraint’  simply  sums  the  assigned  cadets  OMS  scores  to  satisfy  the  constraint  below. 

constraint  qualityfi  in  B}:  sum{j  in  C } omsnbr [ j ] *x [ i, j ] >=qgoals [ i ] ; 

The  result  of  re-running  the  problem  with  these  constraints  Is  shown  below  In  figure  15.  You  can  see  from  figures  14 
and  15  that  as  you  add  In  constraints  to  shape  the  assignments  with  the  additional  constraints  of  proportionally 
distributing  females  or  more  evenly  distributing  OMS  scores  among  the  branches,  our  ability  to  satisfy  cadets  and 
meet  the  demand  constraint  Is  degraded  (about  1 .3%  for  the  female  proportional  distribution  and  about  5%  for  the 
quality  constraint).  Additionally,  for  the  quality  ‘side  constraint’  our  ability  to  meet  the  demand  is  degraded,  but  the 
SAS  facility  supports  any  additional  logic  necessary  to  disambiguate  fractional  assignments  (an  exercise  left  for  the 
reader,  but  fully  supported  by  the  details  in  this  document  (see  the  random  number  added  to  the  sol  data  set  earlier)). 

Quality  distribution  | 

before  constraint  I  After  constraint  Satisfaction  I  Demand 


14 


Figure  15,  Distribution  of  Quality  and  effects  on  satisfaction. 


CONCLUSION 

This  paper  described  the  business  probiem  of  matching  cadets  to  their  assignment  preferences  as  constrained  by 
avaiiabie  assignments  and  other  Army  needs.  Then  the  paper  described  a  generaiized  network  optimization 
approach  (via  nodes  and  arcs)  to  soiving  this  probiem  initiaiiy  using  the  NETFLOW  procedure,  but  evoiving  that 
same  probiem  to  the  OPTMODEL  procedure  showing  the  effects  of  a  suppiy  side  constraint  (proportionai  distribution 
based  on  gender)  and  a  demand  side  constraint  (quaiity  distribution).  The  SAS®  faciiities  abiiity  to  take  data  into  data 
sets  from  aimost  any  data  source,  then  via  the  OPTMODEL  procedure,  feed  them  to  the  Operations  Research  (OR) 
index  sets  form  for  optimization  (the  OPTMODEL  syntax  appears  to  be  based  on  AMPL'  or  GMPL^),  and  back  out  of 
OPTMODEL  into  data  sets  for  resuits  anaiysis  is  a  capabiiity  unmatched  in  the  OR  community. 
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